A redshift distortion free correlation function at third order in the nonlinear regime
ABSTRACT



The zeroth-order component of the cosine expansion of the projected three-point correlation function is proposed for clustering analysis of cosmic large scale structure. These functions are third-order statistics but can be measured similarly to the projected two-point correlations. Numerical experiments with N-body simulations indicate that the advocated statistics are redshift distortion free within 10% in the non-linear regime on scales $\sim 0.2$ to $10\,h^{-1}$ Mpc. The halo model prediction of the zeroth-order component of the projected three-point correlation function agrees with simulations within $\sim 10\%$. This lays the groundwork for using these functions to perform joint analyses with the projected two-point correlation functions, exploring galaxy clustering properties in the framework of the halo model and relevant extensions.

1 INTRODUCTION

Observed large scale structure in the Universe is generally conjectured to arise from Gaussian, or nearly Gaussian, initial conditions; the rather high level of non-Gaussianity at present is due to the action of gravity and gas physics. The three-point correlation function (3PCF) is the lowest-order correlation function capable of probing such non-Gaussianity. With the attempts to extract more information about structure formation processes and primordial non-Gaussianity from the fine clustering patterns of galaxies, the 3PCF (or its counterpart in Fourier space, the bispectrum) has attracted much attention in recent years (e.g. Kayo et al. 2004; Nichol 2006; Smith et al. 2008; Jeong & Komatsu 2009; Sefusatti 2009).

However, the 3PCF is well known for its low return on investment compared with the two-point correlation function (2PCF). One major obstacle hindering the interpretation, and consequently the application, of the 3PCF is the redshift distortion induced by the peculiar velocities of galaxies. Although the effects of redshift distortion on the 2PCF (or power spectrum) are not yet well understood analytically (e.g. Scoccimarro 2004), approximations incorporating the pairwise velocity distribution have been proposed, validated and applied successfully to statistical analyses (Peebles 1980; Davis & Peebles 1983; White 2001; Seljak 2001; Kang et al. 2001; Tinker 2007; Smith et al. 2008). In the case of the 3PCF (or bispectrum), an analogous approach would involve higher-order statistics of peculiar velocities. The complicated entanglement of redshift distortions with nonlinear gravitational dynamics and nonlinear biasing renders theoretical prediction extremely difficult in configuration space. In Fourier space and with the distant observer approximation, prediction of the bispectrum in redshift space in various perturbative and empirical schemes has been moderately successful, although none has been able to show satisfactory agreement with simulations (Matsubara & Suto 1994; Hivon et al. 1995; Verde et al. 1998; Scoccimarro et al. 1999). The most accurate model to date appears to be the work of Smith et al. (2008), a halo model extension implemented with higher-order perturbation theory.

One can eliminate the complexity of redshift distortion by projecting the correlation functions onto the plane perpendicular to the line-of-sight (LOS). Projected correlation functions are obtained by integrating the anisotropic correlation functions along the LOS, which effectively removes redshift distortions if the total number of galaxy pairs and triplets along the LOS is conserved. Since the thickness of a realistic sample is finite, galaxies near the radial edges can enter or leave the sample volume through their apparent displacement due to peculiar velocity, so this conservation is only approximately achieved if the sample is shallow or the redshifts are photometric. Violation of the conservation condition may bring non-negligible systematic bias on large scales (Nock et al. 2010). Nevertheless, this is not a problem for most modern spectroscopic galaxy samples, and the bias can be minimized by careful design of the estimation methodology.

In comparison with the projected 2PCF, which has been widely used to investigate the dependence of clustering on intrinsic galaxy properties, evolution history and environment, and to distinguish cosmological models (e.g. Hawkins et al. 2003; Zheng & Weinberg 2007; Baldauf et al. 2010; Zehavi et al. 2010), exploration and application of the projected 3PCF has been limited in the literature (Jing & Börner 1998, 2004; Zheng 2004; McBride et al. 2010). The lack of accurate theoretical models of the 3PCF prevents proper interpretation of measurements. Scoccimarro & Couchman (2001) offered a phenomenological model based on hyper-extended perturbation theory for the bispectrum in the nonlinear regime. Their fitting formula is accurate on smaller scales, but in the weakly and mildly nonlinear regimes it is improved upon by the empirical model of Pan et al. (2007). Both fail on very small scales, and neither can capture the signal of baryonic oscillations in the bispectrum appropriately (Sefusatti et al. 2010). The halo model approach appears more promising, as it can reproduce most measurements in simulations for the bispectrum (e.g. Ma & Fry 2000a,b; Scoccimarro et al. 2001; Smith et al. 2006, 2008) and the 3PCF in configuration space (e.g. Takada & Jain 2003; Wang et al. 2004; Fosalba et al. 2005). In spite of disagreement with simulations for some configurations of the 3PCF, the halo model is still more attractive than the phenomenological models for its clean and physically motivated parametrization of galaxy biasing through, e.g., the machinery of the halo occupation distribution (HOD; Berlind & Weinberg 2002).

Another reason for the scarce exploration of projected 3PCF is 
the complexity of estimation. Computational requirement of 3PCF 
is demanding for currently available computers when millions of 
points are typical. The additional task of decomposing the separa- 
tions among three points for projected 3PCF adds to the CPU load. 
Furthermore, the 3PCF is already more prone to Poisson noise than 
the 2PCF, and typical bin width of scales for projected 3PCF is even 
smaller than for the normal 3PCF. In order to suppress discreteness 
effects for a reliable estimation, a high number density of points in 
the sample is crucial, but often unrealistic for real surveys. 

By analogy to the monopole of the 3PCF advocated by Pan & Szapudi (2005a,b), we show that a third-order statistical function similar to the angular average of the projected 3PCF is redshift distortion free and relatively easy to estimate and model theoretically. In the next section, the definitions and the relation with the 3PCF, together with the estimation algorithm, are described. Section 3 presents numerical properties of the new statistical measure, while in Section 4 we demonstrate the consistency of halo models with simulations for the new function. Summary and discussion are in the last section.



2 PROJECTED THREE-POINT CORRELATION FUNCTION AND ITS ZEROTH-ORDER COMPONENT

Let $\mathbf{r} = \mathbf{x}_2 - \mathbf{x}_1$ be the vector pointing to a point at position $\mathbf{x}_2$ from a point at $\mathbf{x}_1$. The vector can be decomposed into two components: the separation along the line-of-sight (LOS), $\pi = r\mu$, with $\mu$ being the cosine of the angle between the LOS and $\mathbf{r}$, and the separation perpendicular to the LOS, $\sigma = (r^2 - \pi^2)^{1/2}$. We then have the anisotropic 2PCF $\xi(\sigma, \pi)$ and so the projected 2PCF

$$\Xi(\sigma) = 2\int_0^{+\infty} \xi(\sigma,\pi)\,d\pi = 2\int_\sigma^{+\infty} \frac{r\,\xi(r)\,dr}{\sqrt{r^2-\sigma^2}} = 2\int_\sigma^{+\infty} \frac{s\,\xi(s)\,ds}{\sqrt{s^2-\sigma^2}} , \qquad (1)$$

where $s$ is the separation between two points measured in redshift space and the last step comes from conservation of the total number of pairs along the LOS. Inversion of $\Xi(\sigma)$ could directly render the 2PCF $\xi(r)$, although inversion of such an Abel integral is mathematically unstable (Davis & Peebles 1983).
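As an illustration of Eq. 1, the following Python sketch evaluates the projected 2PCF of a toy power-law $\xi(r) = (r/r_0)^{-\gamma}$ with a finite upper limit $\pi_{\max}$ and compares it with the well-known closed form for $\pi_{\max} \to \infty$. The values $r_0 = 5$ and $\gamma = 1.8$, and all function names, are illustrative assumptions and are not taken from the simulations analysed here.

```python
import math

def xi_power_law(r, r0=5.0, gamma=1.8):
    """Toy isotropic 2PCF xi(r) = (r / r0)^(-gamma); r0 and gamma are illustrative."""
    return (r / r0) ** (-gamma)

def projected_2pcf(sigma, pi_max, xi=xi_power_law, n=2000):
    """Xi(sigma) = 2 * int_0^pi_max xi(sqrt(sigma^2 + pi^2)) dpi, by Simpson's rule.

    The substitution pi = sigma * sinh(u) gives r = sigma * cosh(u) and
    dpi = r du, which keeps the integrand smooth. n must be even.
    """
    u_max = math.asinh(pi_max / sigma)
    h = u_max / n
    total = 0.0
    for i in range(n + 1):
        r = sigma * math.cosh(i * h)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * xi(r) * r
    return 2.0 * total * h / 3.0

def projected_2pcf_exact(sigma, r0=5.0, gamma=1.8):
    """Closed form for the power law in the limit pi_max -> infinity."""
    g = math.gamma
    return sigma * (r0 / sigma) ** gamma * g(0.5) * g((gamma - 1.0) / 2.0) / g(gamma / 2.0)
```

For this toy model, the truncated integral approaches the closed form as `pi_max` grows, and a fixed `pi_max` loses a larger fraction of the signal at larger `sigma`, which is the same qualitative behaviour as the finite integration range discussed for the third-order statistic.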

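Similarly, given three points at $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$, the 3PCF $\zeta(r_1, r_2, r_3)$ is a function of the triangle configuration with three separations $\mathbf{r}_1 = \mathbf{x}_2 - \mathbf{x}_1 = (\sigma_1, \pi_1)$, $\mathbf{r}_2 = \mathbf{x}_3 - \mathbf{x}_2 = (\sigma_2, \pi_2)$ and $\mathbf{r}_3 = \mathbf{x}_1 - \mathbf{x}_3 = (\sigma_3, \pi_3)$. Decomposition of the three separations brings up the anisotropic 3PCF $\zeta(\sigma_{1,2,3}; \pi_{1,2,3})$ with $\sum \pi_{1,2,3} = 0$, and the projected 3PCF is defined as (Jing & Börner 1998, 2004; Zheng 2004)

$$Z(\sigma_1,\sigma_2,\sigma_3) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \zeta(\sigma_{1,2,3};\pi_{1,2})\,d\pi_1 d\pi_2 = 2\int_{\sigma_1}^{+\infty}\!\!\int_{\sigma_2}^{+\infty} \frac{r_1 r_2\,[\zeta(r_1,r_2,r_+) + \zeta(r_1,r_2,r_-)]}{\sqrt{r_1^2-\sigma_1^2}\,\sqrt{r_2^2-\sigma_2^2}}\,dr_1 dr_2 , \qquad (2)$$

in which

$$r_\pm = \sqrt{\sigma_3^2 + (|\pi_1| \pm |\pi_2|)^2} . \qquad (3)$$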

Szapudi (2004a) pointed out that the 3PCF can be expanded in Legendre polynomials $P_\ell$ to isolate part of the configuration dependence,

$$\zeta(r_1,r_2,\theta) = \sum_\ell \frac{2\ell+1}{4\pi}\,\zeta_\ell(r_1,r_2)\,P_\ell(\cos\theta) , \qquad \zeta_\ell(r_1,r_2) = 2\pi\int_{-1}^{1} \zeta(r_1,r_2,\theta)\,P_\ell(\cos\theta)\,d\cos\theta , \qquad (4)$$

in which $\cos\theta = -\mathbf{r}_1\cdot\mathbf{r}_2/(r_1 r_2)$. In the expansion the monopole $\zeta_0$ is of particular interest for its relative simplicity in measurement and interpretation (Pan & Szapudi 2005a,b). One can easily find that $\zeta_0$ is actually the spherical average of $\zeta$ in three-dimensional space,

$$\frac{\zeta_0(r_1,r_2)}{4\pi} = \frac{\int \zeta(r_1,r_2,\theta)\,2\pi\sin\theta\,d\theta}{4\pi} = \frac{\int \zeta\,d\Omega}{\int d\Omega} , \qquad (5)$$

which effectively provides theoretical support for the estimator in Pan & Szapudi (2005b).

In the same spirit, the projected 3PCF $Z$ can also be expanded, but with a different treatment: the cosine Fourier transformation proposed by Zheng (2004) and Szapudi (2009) is the appropriate one, since $Z$ is defined on a two-dimensional plane perpendicular to the LOS. Angular averaging of $Z$ thus produces the zeroth-order component of the cosine expansion of $Z$ (Zheng 2004; Szapudi 2009),

$$Z_0(\sigma_1,\sigma_2) = \frac{1}{2\pi}\int_0^{2\pi} Z(\sigma_1,\sigma_2,\theta_p)\,d\theta_p , \qquad (6)$$

with $\theta_p = \cos^{-1}[(\sigma_1^2+\sigma_2^2-\sigma_3^2)/(2\sigma_1\sigma_2)]$. This is the function that we focus on, and it is related to $\zeta$ by

$$Z_0(\sigma_1,\sigma_2) = \frac{1}{2\pi}\int_0^{2\pi} d\theta_p \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \zeta(\sigma_1,\sigma_2,\theta_p;\pi_1,\pi_2)\,d\pi_1 d\pi_2 = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \tilde\zeta_0(\sigma_1,\sigma_2,\pi_1,\pi_2)\,d\pi_1 d\pi_2 , \qquad (7)$$

where

$$\tilde\zeta_0(\sigma_1,\sigma_2,\pi_1,\pi_2) = \frac{1}{2\pi}\int_0^{2\pi} \zeta(\sigma_1,\sigma_2,\theta_p;\pi_1,\pi_2)\,d\theta_p . \qquad (8)$$
Note that $\tilde\zeta_0$ of Eq. 8 and the monopole of the 3PCF, $\zeta_0$ of Eq. 4, are not equal.

Theoretically, if the nonlinear bispectrum is known, by the cosine transformation

$$B(k_1,k_2,\phi) = \sum_{n=-\infty}^{+\infty} B_n(k_1,k_2)\cos(n\phi) , \qquad B_n(k_1,k_2) = \frac{1}{2\pi}\int_0^{2\pi} B(k_1,k_2,\phi)\cos(n\phi)\,d\phi , \qquad (9)$$

it is fairly straightforward to compute $Z_0$ (Zheng 2004),

$$Z_0(\sigma_1,\sigma_2) = \int_0^{+\infty}\!\!\int_0^{+\infty} \frac{B_0(k_1,k_2)}{(2\pi)^2}\,J_0(k_1\sigma_1)\,J_0(k_2\sigma_2)\,k_1 k_2\,dk_1 dk_2 , \qquad (10)$$

where $J_0$ is the zeroth-order Bessel function of the first kind. An immediate consequence is that $Z_0$ only requires a good approximation to the zeroth-order component of the nonlinear bispectrum, which simplifies theoretical development.
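The double Bessel integral of Eq. 10 can be checked numerically on a toy model. The sketch below assumes the prefactor $(2\pi)^{-2}$ and uses a hypothetical separable Gaussian $B_0(k_1,k_2) = \exp[-(k_1^2+k_2^2)]$, for which the standard identity $\int_0^\infty k\,e^{-k^2} J_0(k\sigma)\,dk = \tfrac{1}{2}e^{-\sigma^2/4}$ gives a closed form. The cut-off `k_max`, the grid sizes and the function names are illustrative choices, not the setup of our actual calculation.

```python
import math

def bessel_j0(x, n=64):
    """J0(x) = (1 / pi) * int_0^pi cos(x sin t) dt, midpoint rule (moderate x only)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n)) / n

def z0_from_b0(b0, sigma1, sigma2, k_max=8.0, n_k=200):
    """Z_0 = (2 pi)^-2 * int int B_0 J0(k1 s1) J0(k2 s2) k1 k2 dk1 dk2 on [0, k_max]^2."""
    dk = k_max / n_k
    ks = [(i + 0.5) * dk for i in range(n_k)]
    w1 = [k * bessel_j0(k * sigma1) for k in ks]  # k1 * J0(k1 sigma1)
    w2 = [k * bessel_j0(k * sigma2) for k in ks]  # k2 * J0(k2 sigma2)
    total = 0.0
    for k1, a in zip(ks, w1):
        for k2, b in zip(ks, w2):
            total += b0(k1, k2) * a * b
    return total * dk * dk / (2.0 * math.pi) ** 2

def toy_b0(k1, k2):
    """Hypothetical Gaussian B_0, chosen only because the integral is known in closed form."""
    return math.exp(-(k1 * k1 + k2 * k2))

def toy_z0_exact(sigma1, sigma2):
    """Closed form of Eq. 10 for toy_b0."""
    return 0.25 * math.exp(-(sigma1 ** 2 + sigma2 ** 2) / 4.0) / (2.0 * math.pi) ** 2
```

A realistic $B_0$ decays much more slowly than this Gaussian, so a production calculation would need a more careful treatment of the oscillatory tail than the plain midpoint rule used here.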

At first glance it seems not useful to invoke $Z_0$, since $Z$ contains more information and $Z_0$ erases the angular dependence completely through averaging. However, thanks to this smoothing, $Z_0$ suffers much less from shot noise than $Z$, i.e. it has smaller variance, which is a welcome property particularly when the sample does not have a high number density. More importantly, as we will see in the next section, $Z_0$ can be easily estimated with the common procedure for the anisotropic 2PCF after some minor modification. The savings in computing time, proportional to the number of galaxies, are tremendous compared with calculating the projected 3PCF of the full configuration.



3 ESTIMATION AND NUMERICAL TEST 
3.1 Estimator 

Estimation of the zeroth-order component of the projected 3PCF is based on Eqs. 7 and 8. Eq. 8 indicates that $\tilde\zeta_0$ can be measured with the same estimator as $\zeta_0$ in Pan & Szapudi (2005b), taking the same form as the one in Szapudi & Szalay (1998),

$$\tilde\zeta_0 = \frac{DDD - 3DDR + 3DRR - RRR}{RRR} , \qquad (11)$$

where the grouped symbols of $D$ and $R$ refer to various normalized number counts of triplets, similar to those in Pan & Szapudi (2005b); the difference is that $\tilde\zeta_0$ is estimated in bins of both $\sigma$ and $\pi$. Explicitly, if scale bins are linear, given two vector bins $\mathbf{r}_{jk} = (\sigma_j, \pi_k)$ and $\mathbf{r}_{j'k'} = (\sigma_{j'}, \pi_{k'})$, with $\sigma_{jk}$ in $(\sigma_{jk} - \Delta\sigma/2, \sigma_{jk} + \Delta\sigma/2)$ and $\pi_{jk}$ in $(\pi_{jk} - \Delta\pi/2, \pi_{jk} + \Delta\pi/2)$, as an example, $DDD$ is obtained through

$$DDD = \begin{cases} \dfrac{\sum_{i=1}^{N_g} n_i(\mathbf{r}_{jk})\,[n_i(\mathbf{r}_{jk}) - 1]}{N_g(N_g-1)(N_g-2)} , & \text{if } \mathbf{r}_{jk} = \mathbf{r}_{j'k'} , \\[2ex] \dfrac{\sum_{i=1}^{N_g} n_i(\mathbf{r}_{jk})\,n_i(\mathbf{r}_{j'k'})}{N_g(N_g-1)(N_g-2)} , & \text{if } \mathbf{r}_{jk} \neq \mathbf{r}_{j'k'} , \end{cases} \qquad (12)$$

where $n_i$ is the number of neighbours of the centre point $i$ counted in the vector bin $\mathbf{r}_{jk}$. Then, by Eq. 7, integrating $\tilde\zeta_0$ over $\pi_k$ and $\pi_{k'}$ yields the estimate of $Z_0$. We have to stress here that, unlike for the 3PCF, the estimator cannot completely eliminate edge effects for $\zeta_0$, $\tilde\zeta_0$ and so $Z_0$; one needs to be cautious when the scales probed are comparable to the characteristic size of the sample.
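The neighbour-count idea behind Eq. 12 can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not our production code: the LOS is taken along the z axis (plane parallel), bins are given by explicit edge lists, the same-bin case uses $n_i(n_i-1)$ ordered pairs of distinct neighbours, and all names are hypothetical.

```python
import math

def bin_index(dx, dy, dz, sigma_edges, pi_edges):
    """Return the (sigma, pi) bin of a separation, or None if it falls outside all bins."""
    sigma, pi = math.hypot(dx, dy), abs(dz)
    for j in range(len(sigma_edges) - 1):
        if sigma_edges[j] <= sigma < sigma_edges[j + 1]:
            for k in range(len(pi_edges) - 1):
                if pi_edges[k] <= pi < pi_edges[k + 1]:
                    return (j, k)
    return None

def ddd_counts(points, sigma_edges, pi_edges):
    """Normalised DDD for every pair of vector bins, built from per-centre neighbour counts."""
    n_g = len(points)
    totals = {}
    for i, p in enumerate(points):
        n_i = {}  # neighbour counts of centre i, keyed by vector bin
        for m, q in enumerate(points):
            if m == i:
                continue
            b = bin_index(q[0] - p[0], q[1] - p[1], q[2] - p[2], sigma_edges, pi_edges)
            if b is not None:
                n_i[b] = n_i.get(b, 0) + 1
        for a, na in n_i.items():
            for b, nb in n_i.items():
                # same bin: ordered pairs of distinct neighbours; different bins: product
                pairs = na * (na - 1) if a == b else na * nb
                totals[(a, b)] = totals.get((a, b), 0) + pairs
    norm = n_g * (n_g - 1) * (n_g - 2)
    return {key: value / norm for key, value in totals.items()}
```

The point of the construction is that triplets never have to be enumerated explicitly: once the neighbour counts of each centre are binned, as in an anisotropic 2PCF code, the triplet counts follow from products of those counts. The remaining terms $DDR$, $DRR$ and $RRR$ would be built the same way with random points.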



3.2 Data preparation and estimation setup 

Since our goal is to provide a redshift distortion free third-order statistic, a key question is whether $Z_0$ measured in redshift space agrees with what we get in real space. In the absence of accurate models of the redshift-distorted 3PCF, particularly in the nonlinear regime, the best approach is to work with N-body simulation data directly. Two realizations of LCDM simulations run with Gadget2 (Springel 2005) were analysed. Their cosmological parameters are taken from WMAP3 fits (Spergel et al. 2007): $\Omega_m = 0.236$, $\Omega_\Lambda = 0.764$, $h = 0.73$ and $\sigma_8 = 0.74$. $512^3$ particles were evolved in both simulations, but one box size is $L = 300\,h^{-1}$ Mpc (box300) and the other is $L = 600\,h^{-1}$ Mpc (box600); the force softening lengths are $12\,h^{-1}$ kpc and $24\,h^{-1}$ kpc respectively. The $z = 0$ output of the box300 simulation and the $z = 0.09855$ output of the box600 simulation were selected for our numerical experiment.

It is impractical to use all particles in the simulations, so for each set of data we generate nine diluted samples for analysis to keep the amount of computation at a reasonable level; all results we present here are mean values of the nine runs, and the actual scatter among the different realizations is very small. For box300 the number of randomly picked points is about 0.2% of the total, while for box600 more than $\sim 600,000$ points are used. Several other samples diluted at different levels were also generated as a consistency check. We find that sample dilution does affect our estimation of $Z_0$, but mainly on very small scales, and that the variance due to discreteness becomes larger with fewer points, as expected.

A common assumption about redshift distortions is the plane parallel approximation (distant observer assumption), which assumes that the observer is so far away from the sample that all lines-of-sight from the observer to galaxies are parallel to each other. It simplifies calculation by reducing a 3D problem to 1D and indeed works well when the scale of interest subtends only a narrow angle at the observer. But the systematic bias introduced by the plane parallel approximation turns out to be significant if the angle becomes wide. Theoretical calculations and numerical measurements have shown that the deviation mainly occurs at relatively large scales and could be more than 10% (e.g. Szalay et al. 1998; Scoccimarro 2000; Szapudi 2004b; Cai & Pan 2007; Pápai & Szapudi 2008). To test the accuracy of the plane parallel approximation, two sets of samples in redshift space are generated for the box300 data: one set takes the distant observer assumption, while the other mimics realistic samples by placing an observer at a distance of $100\,h^{-1}$ Mpc from the nearest surface of the sample. One has to bear in mind that the two redshift distortion scenarios differ not only in sample construction but also in the way of decomposing the separation $\mathbf{r}$ into $(\sigma, \pi)$ during measurement. The output of the box600 simulation we used is at $z \sim 0.1$, where the plane parallel approximation is sufficient for most analyses if the scales of interest are less than $\sim 50\,h^{-1}$ Mpc. We also noticed that applying periodic boundary conditions or simply throwing away those points shifted out of the box by peculiar velocity makes little difference to the final $Z_0$, which effectively eliminates the concern of Nock et al. (2010).
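A minimal Python sketch of the two ingredients discussed above, the redshift-space displacement and the decomposition of a separation into $(\sigma, \pi)$, is given below. It is illustrative only and rests on assumptions not spelled out in the text: the mapping is written for $z = 0$ with $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$, so that positions in $h^{-1}$ Mpc shift by $v/100$, and for the wide angle case the LOS is taken as the direction from the observer to the pair midpoint, which is one common convention and not necessarily the one used for our measurements.

```python
import math

def to_redshift_space(z_real, v_los, box_size, h0=100.0, periodic=True):
    """Plane-parallel mapping s = z + v_los / H0 at redshift 0.

    z_real and box_size in h^-1 Mpc, v_los in km/s. With periodic=False the
    function returns None for points shifted out of the box (they are dropped).
    """
    s = z_real + v_los / h0
    if periodic:
        return s % box_size
    return s if 0.0 <= s < box_size else None

def decompose(p1, p2, observer=None):
    """Split the separation p2 - p1 into (sigma, pi).

    observer=None: plane parallel, LOS along the z axis.
    Otherwise: LOS points from the observer to the pair midpoint (assumed convention).
    """
    r = [b - a for a, b in zip(p1, p2)]
    if observer is None:
        los = (0.0, 0.0, 1.0)
    else:
        mid = [(a + b) / 2.0 - o for a, b, o in zip(p1, p2, observer)]
        norm = math.sqrt(sum(c * c for c in mid))
        los = [c / norm for c in mid]
    pi = abs(sum(a * b for a, b in zip(r, los)))
    r_squared = sum(c * c for c in r)
    sigma = math.sqrt(max(r_squared - pi * pi, 0.0))
    return sigma, pi
```

For an output at $z > 0$ the displacement would instead be $v/[aH(z)]$ in comoving units; the `periodic` switch corresponds to the two treatments of points shifted out of the box mentioned above.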

During our estimation the $\sigma$ bins are set logarithmic with $\Delta\log\sigma = \log 1.4$; the $\pi$ bins are linear with $\Delta\pi = 3$ and $5\,h^{-1}$ Mpc for box300 and box600 respectively. Care must be taken with the bin width, which should not be so wide as to degrade the accuracy, yet should be large enough to achieve $DDD > 0$ for even the narrowest bin. Experience shows that normally $DDD > 100$ is good enough to give a reliable estimate at our accuracy goal of 10%.

As we do not have multiple realizations to produce error bars, for each simulation data set we split the sample into eight sub-volume boxes of half the side length; the scatter of the measurements in these eight sub-volume boxes is then taken as an estimate of the variance.

Figure 1. Fractional change of $Z_0(\sigma_1, \sigma_2)$ with $\pi_{\max}$ compared to the one measured with the largest $\pi_{\max}$. Three classes of configuration $\sigma_2/\sigma_1$ of $Z_0$ are presented. $Z_{0r}$ is measured in real space with $\pi_{\max}$ as specified in the legend; $Z_{0r}^{\max}$ is estimated with the largest $\pi_{\max} = 150\,h^{-1}$ Mpc. Plane parallel approximation in real space means merely the decomposition of $\mathbf{r}$ into $(\sigma, \pi)$ using parallel LOS, while nonparallel corresponds to an external observer at a distance of $100\,h^{-1}$ Mpc from the bottom of the simulation box. Error bars for $\pi_{\max} = 78\,h^{-1}$ Mpc for box300 and $\pi_{\max} = 120\,h^{-1}$ Mpc for box600 are plotted to show the uncertainty.
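The sub-volume error bars can be sketched as follows. This Python fragment only illustrates the bookkeeping: it assigns points to the eight octants of a cubic box and takes the sample standard deviation of eight per-octant measurements. Whether the scatter should be rescaled to the full volume is not specified here, so no rescaling is applied; that choice, and the function names, are assumptions.

```python
import math

def octant_index(point, box_size):
    """Index 0..7 of the half-size sub-box containing the point."""
    half = box_size / 2.0
    return sum(1 << d for d in range(3) if point[d] >= half)

def split_into_octants(points, box_size):
    """Group points by octant; each sub-volume is then analysed independently."""
    octants = [[] for _ in range(8)]
    for p in points:
        octants[octant_index(p, box_size)].append(p)
    return octants

def mean_and_scatter(values):
    """Mean and sample standard deviation (n - 1 in the denominator) of the measurements."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)
```

In use, the statistic would be measured separately in each of the eight lists returned by `split_into_octants`, and `mean_and_scatter` applied to the eight results per bin.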



3.3 Finite integration range along LOS 

The integration range along the LOS to obtain $Z_0$ from $\tilde\zeta_0$ in Eq. 7 should be $(-\infty, +\infty)$ to guarantee conservation of triplets along the LOS. However, one cannot integrate over infinite scales due to the finite radial thickness of realistic samples, so there is always a finite upper limit $\pi_{\max}$ to the integration. Our measurements thus correspond to

$$\int\!\!\int \tilde\zeta_0\,d\pi_1 d\pi_2 = \sum \tilde\zeta_0\,\Delta\pi_1\,\Delta\pi_2 . \qquad (13)$$

Let the subscript $r$ denote quantities in real space and $s$ those in redshift space. The practical limitation certainly introduces systematic bias; hence, mathematically, $Z_{0s} \neq Z_{0r} \neq Z_0$. What we hope is to find a $\pi_{\max}$ such that the contribution from $\pi$ larger than that is negligible at a given tolerance. In our test runs we found that the largest $\pi_{\max}$ permitted is around 1/4 to 1/2 of the box size. If a larger scale is used, the estimator of $Z_0$ suffers greatly from finite-volume effects. The same problem is present when estimating the projected 2PCF, and normally it is agreed that $\pi_{\max} \sim 40$ to $70\,h^{-1}$ Mpc is sufficient to give stable results at small $\sigma$ of less than $\sim 20$ to $30\,h^{-1}$ Mpc, but may not be enough for measurements on larger scales (see Baldauf et al. 2010 and references therein).

Figure 1 presents the convergence of measurements with changing $\pi_{\max}$. It displays the fractional differences of $Z_0$ compared to that calculated from the largest $\pi_{\max}$ allowed by the geometry of the sample. Samples used in this test are all in real space, but for the box300 data we decompose scales by LOS in two ways: plane parallel approximation and wide angle treatment.

For box300 at scales $\sigma < 1\,h^{-1}$ Mpc, $Z_0$ is extremely stable against different choices of $\pi_{\max}$, independent of the scheme of scale decomposition, but at larger $\sigma$ the influence of large $\pi$ becomes more and more evident. It appears that in the wide angle treatment $Z_0$ actually increases with $\pi_{\max}$ up to $\sim 110\,h^{-1}$ Mpc and then falls when $\pi_{\max}$ is enlarged further, while under the plane parallel assumption $Z_0$ rises monotonically with larger $\pi_{\max}$. The results from box600 are somewhat different: $Z_0$ decreases with increasing $\pi_{\max}$ on all $\sigma$ scales. Additional numerical experiments with the box600 data revealed that this behaviour is largely caused by the dilution of the original data: a denser sample has less variation against the choice of $\pi_{\max}$ when $\sigma < 1\,h^{-1}$ Mpc.

We conclude that for an overall precision target of $\sim 10\%$ for $\sigma$ scales below $10\,h^{-1}$ Mpc, $\pi_{\max} \sim 120\,h^{-1}$ Mpc suffices. This is much larger than customary for the projected 2PCF. Note that the sharp breakdown of convergence at scales $\sigma \sim 10$ to $20\,h^{-1}$ Mpc appears to be a numerical artifact where $Z_0$ quickly approaches zero.



3.4 Redshift space versus real space 

Figure 2 shows $Z_0$ of the two simulation data sets in redshift space and real space for different $\sigma_2/\sigma_1$ and $\pi_{\max}$; a detailed comparison is drawn in Figure 3. On most scales of $\sigma < 10\,h^{-1}$ Mpc, residual effects of redshift distortion due to the finite integration domain result in only a minor bias within 10%, except for an upturn in $Z_0$ in redshift space on scales of $\sigma \lesssim 0.2\,h^{-1}$ Mpc. Adjusting $\pi_{\max}$ does not modulate $Z_0$ significantly for $\sigma \lesssim 1\,h^{-1}$ Mpc, but causes some apparent deviations on larger scales, especially where $Z_0$ approaches its zero-crossing point. Nevertheless, it is reassuring from Figures 2 and 3 that $Z_0$ estimated in redshift space agrees well with that of real space, with at most 10% uncertainty for $0.2 < \sigma \lesssim 10\,h^{-1}$ Mpc and $\pi_{\max} \sim 120\,h^{-1}$ Mpc. Thus $Z_0$ can be accepted as a redshift distortion free third-order statistic to a good precision.

Figure 2. $Z_0$ measured in real space (thick lines) and in redshift space (thin lines) for different $\sigma_2/\sigma_1$ and $\pi_{\max}$. Error bars are those of the measurements of the configuration $\sigma_2/\sigma_1 = 1$ in real space. The results not shown for box300 under the plane parallel approximation are similar to the non-parallel case.

Figure 3. Deviation of $Z_0$ in redshift space from $Z_0$ in real space for different configurations $\sigma_2/\sigma_1$ and choices of $\pi_{\max}$. The left panel shows the wide angle treatment of the redshift distortion for the box300 data, the middle panel shows the box300 results under the plane parallel assumption, and the right panel shows box600 with plane parallel redshift distortion.



4 HALO MODEL PREDICTION OF Z 
4.1 Formalism 

The halo model invoked to model the third-order statistic $Z$ of dark matter basically follows Ma & Fry (2000a,b), Scoccimarro et al. (2001), Fosalba et al. (2005) and Smith et al. (2008). Here we just give a brief description of the main ingredients of the model; for more details we refer to the review of Cooray & Sheth (2002).

(i) Halo profile $\rho(r)$. It has been pointed out that the density profile of a virialized dark matter halo is in general ellipsoidal and shows various morphologies rather than a simple universal spherical form (Jing & Suto 2000, 2002). The non-spherical shape of haloes can evidently affect the halo model prediction of the clustering of dark matter on small scales where the one-halo term dominates (Smith et al. 2006). Noting that $Z_0$ is a degenerate 3PCF analogous to $\zeta_0$, and should be similarly insensitive to halo shapes, the popular NFW profile (Navarro et al. 1997) is still adequate for our model. For a halo of mass $M$ it reads

$$u(r) = \frac{\rho(r)}{M} = \frac{f c^3}{4\pi R_v^3}\,\frac{1}{(cr/R_v)(1 + cr/R_v)^2} , \qquad (14)$$

where $f = [\ln(1+c) - c/(1+c)]^{-1}$ and $c(M) = c_0 (M/M_*)^{-\beta}$ is known as the concentration parameter; the parameters $c_0 = 9$, $\beta = 0.13$ are calibrated by numerical simulation (Bullock et al. 2001). Halo mass is defined as $M = (4\pi R_v^3/3)\,\Delta\,\rho_{\rm crit}$, with $\rho_{\rm crit}$ being the cosmological critical density. $\Delta$ is the density contrast for virialization and can be estimated from the spherical collapse model. A good fit for a flat universe with cosmological constant is given by Bryan & Norman (1998),

$$\Delta = 18\pi^2 + 82x - 39x^2 , \qquad (15)$$

with $x = \Omega_m(z) - 1$. As it is more convenient to work in Fourier space for the 3PCF, the Fourier transformed halo profile of Scoccimarro et al. (2001) is used for computations,

$$u(k, M) = f\left\{\sin\eta\,[{\rm Si}(\eta[1+c]) - {\rm Si}(\eta)] + \cos\eta\,[{\rm Ci}(\eta[1+c]) - {\rm Ci}(\eta)] - \frac{\sin\eta c}{\eta(1+c)}\right\} , \qquad (16)$$

with $\eta = kR_v/c$. Note that the halo profile is presumed to be truncated at the virial radius $R_v$ in the standard version of the halo model for large scale structure; in extensions the truncation radius becomes an adjustable parameter in an attempt to find the best match to simulations.

(ii) Mass function $n(M)$. There are many versions of halo mass functions, but it turns out that using a mass function with higher precision brings about only a relatively minor change to $Z_0$ on small scales, as we tested. The classical Sheth-Tormen function (Sheth & Tormen 1999) is sufficient,

$$n(M)\,M\,dM = \bar\rho\,\frac{dy}{y}\,A\,\gamma\,\sqrt{\frac{g(\nu)}{2\pi}}\,\left(1 + g(\nu)^{-p}\right)\exp\left(-\frac{g(\nu)}{2}\right) , \qquad (17)$$

in which $\gamma = d\ln\sigma_M^2/d\ln R$, $g(\nu) = a\nu^2$, $\nu = \delta_c/\sigma_M$, $y = (M/M_*)^{1/3} = R/R_*$ with $R = R_v\Delta^{1/3}$, and $\sigma_M^2$ being the extrapolated linear variance of the dark matter fluctuations smoothed over the Lagrangian scale $R$. $A = 0.322$, $a = 0.707$, $p = 0.3$ are parameters fitted to simulations by Jenkins et al. (2001).

Note that the definition of halo mass in the Sheth-Tormen function is $M = 4\pi\bar\rho\Delta R_v^3/3$ with $\bar\rho = 2.78\times10^{11}\,\Omega_m h^2\,M_\odot\,{\rm Mpc}^{-3}$ being the dark matter density of the present universe, while the halo mass in the NFW profile is defined with the critical density $\rho_{\rm crit} = \bar\rho/\Omega_m$; the conversion between halo parameters of the two sets is given in Smith & Watts (2005). Scoccimarro et al. (2001) already noticed the inconsistency but argued that the effects of the difference could be largely cancelled in practical calculations, and Fosalba et al. (2005) also found that changing the concentration parameter by as much as 50% would not affect the final results significantly. This is also the case in our calculation.

(iii) Halo bias. The distribution of massive haloes is biased relative to the dark matter. Most halo bias functions extracted from N-body simulations (e.g. Mo et al. 1997; Jing 1999; Sheth & Tormen 1999; Tinker et al. 2010) are refinements of the analytical model of Mo & White (1996). The bias used in our recipe is the fitting formula given by Sheth & Tormen (1999),

$$b(\nu) = 1 + \frac{g(\nu) - 1}{\delta_c} + \frac{2p}{\delta_c\,(1 + g(\nu)^p)} , \qquad (18)$$

in which $\delta_c$ is the linear overdensity threshold for spherical collapse. Its cosmological dependence is so weak that a constant value of 1.686 is usually taken. The most recent update of Tinker et al. (2010) is also applied in our code for a consistency check, and the results indicate that the improvement to $Z_0$ is minor and limited to the intermediate nonlinear regime; it does not bring significant improvement to the overall accuracy when considering the magnitude of the numerical errors of the estimation of the previous section.
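The ingredients listed above are simple enough to sketch in Python. The fragment below is a hedged illustration rather than our actual code: it uses the published Bryan & Norman (1998) fit $18\pi^2 + 82x - 39x^2$ with $x = \Omega_m(z) - 1$, the standard mass-normalised NFW form truncated at $R_v$, and the Sheth-Tormen bias with $g(\nu) = a\nu^2$. Function names and default arguments are assumptions, and `enclosed_fraction` is only a numerical sanity check that the truncated profile integrates to unity.

```python
import math

def delta_vir(omega_m_z):
    """Virial density contrast (relative to critical), flat universe with Lambda."""
    x = omega_m_z - 1.0
    return 18.0 * math.pi ** 2 + 82.0 * x - 39.0 * x * x

def concentration(mass, m_star, c0=9.0, beta=0.13):
    """c(M) = c0 * (M / M_*)^(-beta)."""
    return c0 * (mass / m_star) ** (-beta)

def nfw_u(r, r_vir, c):
    """Mass-normalised NFW profile u(r) = rho(r) / M for r <= r_vir."""
    f = 1.0 / (math.log(1.0 + c) - c / (1.0 + c))
    y = c * r / r_vir
    return f * c ** 3 / (4.0 * math.pi * r_vir ** 3) / (y * (1.0 + y) ** 2)

def enclosed_fraction(r_vir, c, n=2000):
    """Midpoint-rule integral of 4 pi r^2 u(r) from 0 to r_vir; should be close to 1."""
    h = r_vir / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * nfw_u(r, r_vir, c)
    return total * h

def sheth_tormen_bias(nu, a=0.707, p=0.3, delta_c=1.686):
    """b(nu) = 1 + (g - 1) / delta_c + 2p / (delta_c * (1 + g^p)), with g = a nu^2."""
    g = a * nu * nu
    return 1.0 + (g - 1.0) / delta_c + 2.0 * p / (delta_c * (1.0 + g ** p))
```

Enlarging the halo boundary beyond $R_v$, as explored in halo model extensions, would change the upper limit in `enclosed_fraction` and hence the normalisation factor; that variant is not implemented in this sketch.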

To avoid the multi-dimensional integrations involved in the direct calculation of $\zeta$ in configuration space (Takada & Jain 2003), we work in Fourier space to yield the bispectrum $B$ predicted by the halo model first. Then $B_0$ is easily obtained to render $Z_0$ through the transformation of Eq. 10. In the halo model, the bispectrum consists of three separate terms, namely the one-halo, two-halo and three-halo terms,

$$B(k_1,k_2,\theta) = B_{1h} + B_{2h} + B_{3h} , \qquad (19)$$

in which

$$B_{1h} = I_{03}(k_1,k_2,k_3) , \qquad B_{2h} = I_{11}(k_1)\,I_{12}(k_2,k_3)\,P_L(k_1) + {\rm cyc.} , \qquad B_{3h} = \left[\prod_{i=1}^{3} I_{11}(k_i)\right] B_{PT} , \qquad (20)$$

and

$$I_{ij}(k_1,\ldots,k_j) = \int \frac{dr}{r}\,n(r)\,b_i(r)\,[u(k_1,r)\cdots u(k_j,r)]\left(\frac{4\pi r^3}{3}\right)^{j-1} , \qquad (21)$$

with $b_0 = 1$, $b_1 = b(\nu)$ and $b_i = 0$ for $i > 1$ to neglect quadratic and higher-order biasing terms. $P_L$ is the linear power spectrum, generated by CMBFAST (Seljak & Zaldarriaga 1996) with the cosmological parameters of the simulations we use. $B_{PT}$ is the bispectrum predicted by Eulerian perturbation theory at tree level (e.g. Bernardeau et al. 2002).
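For orientation, the tree-level bispectrum $B_{PT}$ and its zeroth cosine component can be sketched in Python as below. The sketch uses the standard second-order kernel $F_2$ in its Einstein-de Sitter form and assumes that $\phi$ is the angle between the wavevectors $\mathbf{k}_1$ and $\mathbf{k}_2$, with $\mathbf{k}_3 = -(\mathbf{k}_1 + \mathbf{k}_2)$; the power-law spectrum in the usage note is a hypothetical stand-in for the CMBFAST output, and collapsed triangles with $k_3 = 0$ are not handled.

```python
import math

def f2_kernel(ka, kb, cos_ab):
    """Second-order Eulerian PT kernel F2 (Einstein-de Sitter form)."""
    return 5.0 / 7.0 + 0.5 * cos_ab * (ka / kb + kb / ka) + 2.0 / 7.0 * cos_ab * cos_ab

def b_tree(k1, k2, phi, p_lin):
    """B_PT = 2 F2(k1, k2) P(k1) P(k2) + cyc.; phi is the angle between k1 and k2."""
    c12 = math.cos(phi)
    k3 = math.sqrt(max(k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * c12, 0.0))
    c23 = -(k2 + k1 * c12) / k3  # cosine between k2 and k3 = -(k1 + k2)
    c31 = -(k1 + k2 * c12) / k3  # cosine between k3 and k1
    p1, p2, p3 = p_lin(k1), p_lin(k2), p_lin(k3)
    return 2.0 * (f2_kernel(k1, k2, c12) * p1 * p2
                  + f2_kernel(k2, k3, c23) * p2 * p3
                  + f2_kernel(k3, k1, c31) * p3 * p1)

def b0_tree(k1, k2, p_lin, n=256):
    """Zeroth cosine component (1 / 2 pi) * int_0^{2 pi} B_PT d phi, midpoint rule.

    An even n keeps the sample points away from phi = pi, where k3 can vanish
    if k1 == k2.
    """
    step = 2.0 * math.pi / n
    return sum(b_tree(k1, k2, (i + 0.5) * step, p_lin) for i in range(n)) / n
```

With a power law such as `p_lin = lambda k: k ** -1.5`, an equilateral triangle (`phi = 2 * math.pi / 3`, `k1 == k2`) gives $B_{PT} = (12/7)P^2$, i.e. the familiar reduced amplitude $Q = 4/7$, which is a convenient check of the kernel.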



4.2 Comparison with simulations 

$Z_0$ predicted by the halo model and by Eulerian perturbation theory is shown in Figures 4 and 5, overlaid with measurements of the box300 and the box600 simulation data, respectively, estimated in real space. Results of our halo model and simulations agree remarkably well at both redshifts $z = 0$ and $0.1$, especially on $\sigma_1$ scales between $\sim 0.2$ and $\sim 5\,h^{-1}$ Mpc.

On very small scales, $\sigma_1 \lesssim 0.2\,h^{-1}$ Mpc, the halo model predicted $Z_0$ is larger and steeper than in the simulation. This is more apparent for box600. Numerical tests reveal that this is partly due to dilution of the original data set: a higher density of points leads to higher clustering power in this regime. On large scales, where the halo model follows perturbation theory, both theories begin to over-predict the clustering strength of simulations for larger $\sigma_2/\sigma_1$, which should not be attributed to the imperfection of halo models but rather to the inaccuracy of $B_{PT}$ on these scales (Pan et al. 2007; Guo & Jing 2009).

To improve halo model performance at the three-point level, adjustments of the halo boundary and mass function are usually adopted (Takada & Jain 2003; Wang et al. 2004; Fosalba et al. 2005). This alleviates the disagreement to some extent. Here we also enlarge the halo boundary beyond $R_v$ and truncate the high mass tail of the halo mass function (Figures 4 and 5). Experiment indicates that this extension of the halo radius without a hard cut-off of the mass function can easily generate the correct shape and amplitude of $Z_0$ of simulations. Simple fitting shows that the best halo boundary is $\sim 1.5R_v$ for box300 and $\sim 1.6R_v$ for box600. In contrast, if we keep the halo boundary unchanged but truncate the halo mass function, the one- and two-halo terms are so strongly modified that the shape and the amplitude of $Z_0$ deviate from simulations significantly. Simple fitting to simulations with both the halo boundary and the mass cut-off left free reveals that the mass cut-off could not be smaller than $10^{15}\,M_\odot$. Otherwise, there is no way to reconcile the under-predicted $Z_0$ with simulations at transition scales of $\sigma_1 \sim 3\,h^{-1}$ Mpc, above which $B_{PT}$ breaks. In conclusion, enlarging the halo boundary alone is sufficient for accurately predicting $Z_0$.

Figure 4. $Z_0$ of the box300 simulation ($z=0$) with predictions from the halo model and Eulerian perturbation theory (PT). Symbols are measurements of simulations in real space with different $\pi_{\max}$. The upper row of plots shows the effects of adjusting the halo boundary radius in units of $R_v$ but without a cut-off to the halo mass function, while the bottom row demonstrates the consequence of cutting the high mass tail of the mass function with the halo boundary fixed to

During our calculation, we also examine the influence of the halo bias function and the mass function by using the high precision formulae of Tinker et al. (2010). Such a replacement does not cause a fundamental change to the theoretical prediction (Figure 6). The new mass functions do not benefit the halo model much. On most $\sigma$ scales less than $10\,h^{-1}$ Mpc, the halo model with Sheth-Tormen functions is consistent with simulations within our error budget of $\sim 10$ to $20\%$. The replacement with the functions provided by Tinker et al. (2010) increases the deviation level to around 20 to 30%, especially on scales of $\sim 1\,h^{-1}$ Mpc; a visible advantage only appears on scales of $\sigma_1 \gtrsim 3\,h^{-1}$ Mpc, with an accuracy gain of a few percent.

In addition to the halo model, we also checked the phenomenological models of Scoccimarro & Couchman (2001) and Pan et al. (2007). The formula of Scoccimarro & Couchman (2001) is very accurate on scales $\sigma_1 \gtrsim 1\,h^{-1}$ Mpc but deviates from the simulations by more than 40% on smaller scales. The performance of Pan et al. (2007) is poor in terms of $Z_0$, as the bispectrum model is not designed to conserve clustering power and the resulting integration over it yields an incorrect amplitude. Nevertheless, if a renormalization is enforced for the model to be consistent with perturbation theory on large $\sigma$, the model works well for $Z_0$ at $\sigma_1 \gtrsim 2\,h^{-1}$ Mpc.



5 SUMMARY AND DISCUSSION 

Figure 5. $Z_0$ of the box600 simulation data and models. See Figure 4 for details.

In this paper we propose a third-order correlation function for
characterising galaxy clustering properties. The statistic $Z_0$ we
advocate is the zeroth-order component of the projected 3PCF. Although
$Z_0$ is a 3PCF, its estimation takes roughly the same amount of
computation as the projected 2PCF. The algorithm can be easily
implemented after moderate modification of a code for the
projected 2PCF.
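
To illustrate why the cost is comparable to that of a projected 2PCF code, the sketch below counts triplets around each centre as products of the per-centre neighbour counts that a pair-counting loop already produces, summed over all opening angles. This is an assumption about how such a modification could look, not the estimator used in this work: the function name, the brute-force loop, and the choice of the third coordinate as the LOS are hypothetical, and normalisation by random catalogues is omitted.

```python
import numpy as np

def angle_averaged_triplet_counts(pos, sigma_edges, pi_max):
    """Count triplets (i; j, k) with j in projected bin a and k in bin b around i.

    pos: (N, 3) array; the third axis is taken as the line of sight (LOS).
    Returns an (nbins, nbins) array summed over all centres i and over all
    opening angles between the two projected separations.
    """
    pos = np.asarray(pos, dtype=float)
    nbins = len(sigma_edges) - 1
    counts = np.zeros((nbins, nbins))
    for i in range(len(pos)):
        d = pos - pos[i]
        sigma = np.hypot(d[:, 0], d[:, 1])   # separation perpendicular to LOS
        keep = np.abs(d[:, 2]) <= pi_max     # LOS cut, as for a projected 2PCF
        keep[i] = False                      # exclude the centre itself
        # Per-centre neighbour counts: the quantity a pair-counting code has.
        n_i, _ = np.histogram(sigma[keep], bins=sigma_edges)
        # Ordered pairs of distinct neighbours: n_a * n_b, minus j == k if a == b.
        counts += np.outer(n_i, n_i) - np.diag(n_i)
    return counts

# Toy configuration (hypothetical): four points in a plane plus one point
# far away along the LOS, which the pi_max cut removes from every triplet.
toy_pos = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [3.0, 0.0, 0.0],
                    [0.0, 0.0, 50.0]])
toy_counts = angle_averaged_triplet_counts(toy_pos, [0.5, 2.5, 4.0], pi_max=10.0)
```

The only step beyond pair counting is the outer product of the per-centre histogram with itself, which is why the extra cost is modest in this sketch.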

Various numerical experiments confirm that $Z_0$ can be deemed
redshift distortion free within approximately 10% in the
regime where the scale perpendicular to the LOS is
$0.2 < \sigma < 10\,h^{-1}$ Mpc. In addition, the maximal integration scale $\pi_{\max}$
parallel to the LOS during estimation ought to be greater than
$\sim 120\,h^{-1}$ Mpc. A serious concern is that shot noise could ruin the
estimation in the strongly nonlinear regime if the number density
of points in a sample is too low. This requirement for a robust $Z_0$
measurement is tighter than for the projected 2PCF, but still weaker
than for the normal projected 3PCF, since $Z_0$ is an integral of the
projected 3PCF. The criterion we suggest is $DDD \gtrsim 100$.
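
As a rough illustration of the suggested criterion, the sketch below masks bins whose triplet count falls under the threshold and evaluates the Poisson fractional error $1/\sqrt{N}$. Treating the counts as independent Poisson variables is an assumption made here for illustration (triplets in a clustered sample are correlated), and the function names and toy counts are hypothetical.

```python
import numpy as np

def reliable_bins(ddd, min_triplets=100):
    """Boolean mask of bins whose triplet count meets the threshold."""
    return np.asarray(ddd) >= min_triplets

def poisson_fractional_error(count):
    """Fractional counting error 1/sqrt(N), assuming Poisson statistics."""
    return 1.0 / np.sqrt(np.asarray(count, dtype=float))

# Toy triplet counts per bin (hypothetical, not measured values).
toy_ddd = np.array([12, 100, 2500])
toy_mask = reliable_bins(toy_ddd)
toy_err = poisson_fractional_error(toy_ddd[toy_mask])
```

Under this simplified assumption, a bin with 100 triplets carries a counting error of about 10%, comparable to the accuracy level quoted for $Z_0$.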

As expected, the halo model provides a satisfactory prediction
of the dark matter $Z_0$ of simulations, within $\sim 10\%$, if the classical
Sheth-Tormen mass functions are used. Our computation indicates
that extending the halo boundary is enough to yield a good fit to
simulations, while a hard cut-off to the mass function is not as effective
as previous works claimed. Substituting new high precision functions for
the halo mass distribution and halo biasing does not lead to
significantly better agreement with simulations. Since the angular
dependence in the projected 3PCF and the normal 3PCF is smeared
out in $Z_0$, we conjecture that using an anisotropic halo profile
probably will not significantly improve accuracy. A significant bias of
the halo model predicted $Z_0$ compared to simulations emerges in the
weakly nonlinear regime, where halo models boil down to second-order
perturbation theory; the latter is already known to be poor
in predicting the dark matter 3PCF. A more precise bispectrum from
higher order perturbation theories may offer a way to increase
precision (e.g. Valageas 2008; Sefusatti 2009; Bartolo et al. 2010).

The principal reason for proposing $Z_0$ is to provide an efficient
redshift distortion free 3PCF, complementary to the standard
projected 2PCF, for galaxy clustering analyses. It is well known
that the projected 2PCF itself is a Gaussian statistic only and thus
has its limitations. Third order correlation functions, mainly carrying
information about non-Gaussianity, are more sensitive to details
of the galaxy distribution. Non-Gaussianity of the galaxy distribution
is generated by the nonlinear action of gravitational force and gas
physics if the primordial density fluctuation of the universe after
inflation is Gaussian. The degeneracy shown in the projected 2PCF (e.g.
Zu et al. 2008) may be broken if third order correlation functions
are employed. The redshift distortion free feature of $Z_0$ on scales

Figure 6. Relative differences of the halo model predicted $Z_0$ to that
estimated from the box300 simulation. Red dotted lines correspond to
the prediction of the halo model with halo mass function and bias
function given by Sheth & Tormen (1999), blue dashed lines are generated by
the model with mass function and bias function provided by Tinker et al.
simulation is the estimation from the box300 data with plane-
parallel assumption to the redshift distortions.



less than $10\,h^{-1}$ Mpc defines its potential in investigating the
relation of galaxies with their host halos, and the formation histories
of galaxies and halos. Furthermore, the success of the halo model
prediction of the dark matter $Z_0$ encourages us to apply $Z_0$ to the
analysis of galaxies. In principle, with measurements from galaxy samples, $Z_0$
enables us to generalize and diagnose schemes of HOD, conditional
luminosity function (CLF, Yang et al. 2003) and semi-analytical
models (e.g. Baugh 2006) to third order statistics at the cost of one
additional free parameter, the halo boundary. Our present work is
restricted to dark matter only; the behavior of $Z_0$ for biased objects
remains unclear. Testing with mock galaxy samples before applying
to real data will be necessary.